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Abstract 

Measures of the direction and strength of the interdependence between two time series are eval- 
uated and modified in order to reduce the bias in the estimation of the measures, so that they give 
zero values when there is no causal effect. For this, point shuffling is employed as used in the frame 
of surrogate data. This correction is not specific to a particular measure and it is implemented 
here on measures based on state space reconstruction and information measures. The performance 
of the causality measures and their modifications is evaluated on simulated uncoupled and coupled 
dynamical systems and for different settings of embedding dimension, time series length and noise 
level. The corrected measures, and particularly the suggested corrected transfer entropy, turn out 
to stabilize at the zero level in the absence of causal effect and detect correctly the direction of infor- 
mation flow when it is present. The measures are also evaluated on electroencephalograms (EEG) 
for the detection of the information flow in the brain of an epileptic patient. The performance of 
the measures on EEG is interpreted, in view of the results from the simulation study. 
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I. INTRODUCTION 



The interaction or coupling between variables or sub-systems of a coniplex dynamical sys- 
tem is a developing area of nonlinear dynamics and time series analysis l2|. The detection 
and characterization of interdependence among interacting components of complex systems 
can give information about their functioning and a better understanding of the system dy- 
namics. Information flow is a ubiquitous feature of many complex physical phenomena, such 
as climatic processes , electronic circuits [5(|, financial markets [61], and the brain system 

3,3. 

Given a set of time series observations, it is essential to assess whether they originate 
from coupled or uncoupled systems, estimate the hidden causal dependencies between them 
and detect which system is the driver and which is the responder. Granger causality has 
been the lead.ng approaeh for a long time for inferrmg the d,reet,on of .nteraefons. based 
on the predictability of time series using linear models |9|. If the prior knowledge of a 
time series improves the prediction of another, the former Granger-causes the latter. Many 
measures have been developed based on the concept of Granger causality using cross-spectra 



and cross- prediction of linear models [lO|, Granger causality has been extended to 

incorporate also nonlinear relationships using nonlinear models jl2[ [isl. or other model- free 
measures that exploit nonlinear properties of dynamical systems, such as measures based 
on phase and event synchronization nE3.reconstr„etion of the state spaces 
and information theory I201 24 ] . The information measures make no assumptions on the 
system dynamics as opposed to phase or event synchronization measures that assume strong 
oscillatory behavior or distinct event occurrences, respectively, and the state space methods 
that require local dynamics being preserved in neighborhoods of reconstructed points. 

Comparative studies on different causality measures reported in 25l-l28j are not conclusive 
and do not point to the same measures, also because different measures are used in each 
study. In a recent review and evaluation of state space, synchronization and information 
causality measures, we stressed the need to render the statistical significance of the causality 
measures as most measures are biased and indicate causal effects when they are not present 



29|. A thorough investigation for the validity and usefulness of a causality measure should 



start with a test of significance, i.e. a measure should not identify coupling (or interaction) 
in any direction when it is not present. In statistical terms, this means that the actual 
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probability of rejection of the null hypothesis when it is true does not exceed the nominal 
significance level, usually set to 0.05. As in any statistical test, power is of interest after the 
correct significance is established, where power here regards the sensitivity of the causality 
measure in detecting interaction and identifying its direction. Some approaches have been 
proposed to render significance of the coupling measures using the concept of surrogate data 
testin g. T he so-called effective transfer entropy uses a random shuffling of the driving time 
series |22|. Twin surrogates, generated as shadowing trajectories of the original trajectories, 
have recently been suggested to preserve the original individual dynamics 30|. Apparently, 



the closeness of shadowing in the twin surrogates determines the level at which the coupling 
is destroyed. A different and simple way to generate surrogates is to time-shift the one of 



the two time series, as suggested in 



13|- 



We propose here a different generation of surrogates and shuffle randomly the recon- 
structed points of the driving time series, rather than the samples as done for the effective 
transfer entropy. The random shuffling destroys completely the coupling and the use of the 
reconstructed points preserves the individual system dynamics, perhaps not in the same way 
as by the twin or time-shifted surrogates. Instead of making a formal surrogate data test, 
we use these surrogates to correct the measure and reduce the bias. The performance of the 
measures of mean conditional probability of recurrence 18|], transfer entropy 20| and sym- 



bolic transfer entropy 



23| . as well as the respective surrogate-based corrections is assessed 



on multiple realizations of uncoupled and coupled nonlinear systems (maps and flows) for 
a range of increasing coupling strengths. In the numerical simulations, the detection of the 
coupling directionality and strength is evaluated at different settings of dynamics complex- 
ity, time series length, noise level and embedding dimensions for the reconstruction of the 
two state spaces. All these factors can be sources of bias in the estimation of the causality 
measures. 

The structure of the paper is as follows. The directional coupling measures considered in 
this study are briefly presented in Sec. [Tll and the proposed corrections of the measures in 
Sec. mil The results of the application of the measures and their corrections on simulated 
systems are discussed in Sec. IIVI and on a real application of EEG recordings from epileptic 
patients in SecJVl Finally, the conclusions are drawn in Sec. I VII 
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II. CAUSALITY MEASURES 



Let {xt} and {yt}, t = l,...,n, denote two simultaneously observed time series de- 
rived from the dynamical systems X and Y, respectively. We formulate the causality 
measures for the causal effect of system X on system Y, denoted as X Y. For the 
opposite direction Y ^ X the formulation is analogous. Let nix and my be the embed- 
ding dimensions and and Ty the delays for the state space reconstructions of the two 
systems, respectively, giving the reconstructed points = {xt,Xt-T^, . ■ . ,Xt-(m^-i)T^y and 

= {yt, Vt-Ty, • • • , Vt-irny-iyy)', whcrc t = 1, . . . , n' and n' = n-max{(m^.-l)r^., {my-l)Ty}. 
The steps ahead or time horizon to address the interaction is denoted by h. 



A. Mean conditional probability of recurrence 



A state space causality measure based on recurrence quantification analysis 



recently introduced, termed the mean conditional probability of recurrence ISj. Let 



3l| has been 



= - l|x^ - Xjll), Rfj = Q{ey - ||y, - yj), i,j = l,...,n' 



be the recurrence matrixes of X and Y, respectively, where G(-) is the Heaviside function 
counting points with distance smaller than the predefined distance thresholds Ex and Sy, 
respectively. The joint recurrence matrix of {X, Y) is defined as 

Ji!j^ = - ||xi -Xj-||)e(£j/ - llYi - yJ), i,j = l,...,n', 

i.e. a joint recurrence occurs if the system X recurs in its own phase space and simultane- 
ously, the system Y recurs also in its own phase space. The mean conditional probability of 
recurrence (MCR) is defined as 

1 n' y^n' jX,Y 

MCRx^Y = -T % ■ (1) 

i=l l^j=l 

If X drives Y, then MCRx^y > MCRy^x- The concept of recurrence has been used to 
quantify a weaker form of synchronization, and MCR is an extension of it that detects the 
direction of coupling jl8 |. 
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B. Transfer entropy 



Transfer entropy (TE) quantifies the information flow from X to y by the amount of 
information explained in Y at one step ahead (or generally h steps ahead) by the state of 



X, accounting for the concurrent state of Y 20|. The concept of TE extends the Shannon 



entropy to transition probabilities and quantifies how the conditioning on X change the 
transition probabilities of Y . Using the reconstructed points for X and Y as given above, 
TE is defined as 

T^x^y = Ep(?/*+/.,x„y,)log^^^^^^^, (2) 

p{yt+h\yt) 

where p{yt+h,^t,yt), Piyt+h\^t,yt), and p{yt+h\yt) are the joint and conditional probability 
mass functions for a proper binning. The time horizon h is introduced here instead of the 
time step one that was originally used in the definition of TE. TE can also be defined in 
terms of entropies as 

TEx^Y = H{^t,yt) - Hiyt+h..^t..yt) + H^yt+h.y^) - H{y^). (3) 

Instead of binning, we define TE in terms of correlation sums as follows. Let X be a 
continuous, possibly vector-valued, random variable. For a fixed small r, the entropy of a 
variable X can be estimated as H{X) ~ In C(xt) + m In r [32^, where C(xt) is the correlation 
sum for the vectors Xf with embedding dimension m (C(xj) is an estimate of the probability 
of points being closer than r). The standardized Euclidian norm, i.e. the Euclidean distance 
divided by the square root of the embedding dimension, is used for the calculation of the 
correlation sum. Let us denote the correlation sums of the vectors , x^ , yj , y^, [xt,yj 
and [yt+h,yt] as C(yt+ft, x^, yj, C(yJ, C(xt,yJ and C{yt+h,yt), respectively. Then, TE is 
defined as 

rpT^ 1 C{yt+h,^t,yt)C{yt) 

TEx^Y = log —f V- v4) 

C{^t,yt)C{yt+h,yt) 

C. Symbolic transfer entropy 

Symbolic transfer entropy (STE) is the transfer entropy defined on rank-points formed 



by the reconstructed vectors of X and Y 23|]. Thus, for each vector y^, the ranks of 



its components assign a rank-point y^ = [ri, r2, . . . , r^^], where Vj G {1, 2, . . . , m^^} for 
j = l,...,my. Following this sample-point to rank-point conversion, the sample yt+h in 
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Eq.(|2]) is taken as a rank point at time t + h, Yi+h, and STE is defined as 

STEx^y = H{kt,yt) - ^(yt+ft,xt,yj + i7(yt+/,,yj - //(yj, (5) 
where the entropies are defined on the rank-points. 

D. Effective transfer entropy 



A modification of TE, called effective transfer entropy (ETE), was defined in 22| as the 
difference of TE computed on the original bivariate time series and the TE computed on a 
surrogate bivariate time series, where the driving time series X is randomly shuffled 

ETEx^y = TEx^y - TE^^^^^^^^y- 

The use of a randomly shuffled surrogate aims at setting a significance threshold in the 
estimation of TE. The approach of ETE can be used for the estimation of any other causal 
measure. Here, for the estimation of ETE, instead of one random permutation a num- 
ber of M random permutations of the driving time series X are considered and therefore 

TEv- , „ , -.y in the definition of ETE is replaced by the mean of the corresponding M 
shufned 

values 



ETE.^v- = TE,^, - - g TE, ^^^g,^^^., (6) 
where / = !,..., M. In the same way, we define effective STE, denoted ESTE. 

E. Relationship of MCR and TE 

MCR defines in a rather direct way the conditional probability of close points in Y 
given they are close in X, whereas most state space methods, such as nonlinear interdepen- 
dence measures 

atte.pt to app.o.,™ate t.. co„d..„a. p.obab.Ht, ind.ect.y 

through analogy in distances. To this respect, the MCR method is closer to information 
measures, such as TE. However, TE involves also transition probabilities that can give ad- 
ditional information about the effect of the driving system on the future of the response 
system. 

In 181], it is stressed that MCR needs smaller number of data points than TE. This is 



true if binning estimators are used for the probability functions in TE, but for the estimator 



considered here using correlation sums, the data requirements are the same, as the stabihty 
of the estimation in both MCR and TE rehes on having good statistics of points in the 
neighborhoods for a given distance. Certainly, this holds when the distance r in TE and 



the distances e^. and in MCR are at the same level. It is also mentioned in 



isj that TE, 



but not MCR, may give values larger than zero for both directions when the coupling is 
purely unidirectional. As we show below, this bias is not specific to a measure and should 
be attributed to other factors, such as the system complexity and the length of the time 
series. The fact that MCR did not exhibit this bias in the results in 18| may be due to 



the optimization of the values of the thresholds and ey, so that for no coupling both 
averages of the estimated probabilities of recurrences p(xj) and p(yj) are equal to 0.01. In 
our simulations we do not optimize and but use a fixed = ey = r and have the time 
series standarized, i.e. the same distance threshold is used in the computation of both MCR 
and TE in all simulations. 

III. MODIFICATIONS OF CAUSALITY MEASURES 

A main drawback of all causality measures considered in this study is that they do not 
provide stable and consistent results, particularly for weak coupling structures and noisy 
time series. The measures have bias that may be different in each direction, depending also 
on the dynamics of each system, the time series length and the state space reconstruction. 
The existence of bias and spurious detection of causal effects has been previously reported 



for different causality measures jl|, |25|, 



21, 



29| . When there is no causal effect the positive 



bias may be misinterpreted as weak coupling. 

A possible solution to this problem is provided by reducing the bias of the measure using 
surrogate data. Surrogate time series can be used to rule out spurious conclusions about 
the existence and the direction of coupling. When testing the null hypothesis that the two 
time series are uncoupled, the bivariate surrogates should replicate the dynamics of each 
system and be independent to each other. In this way, the bias due to the individual system 
dynamics and state space reconstruction is preserved in the surrogates. Here, we do not 
apply a formal surrogate data test, but we use the surrogate values to correct for the bias 
of the coupling measure, as shown below. 

The approach in ETE attempts to generate surrogates for this purpose by randomizing 
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the temporal structure of the driving time series, so that if the systems are coupled, cause and 
effect are lost. However, by randomly shuffling a time series, its self- dynamical structure is 
destroyed as well. We present below correction to the measures MCR, TE and STE, based on 
the frame of surrogates, which are extracted by randomly shuffling the reconstructed vectors 
of the driving time series in order to preserve the dynamical properties of each system. 



A. Corrected MCR 



The corrected MCR (CMCR) defined below aims at reducing the bias of MCR in case 
of no causal effects. Recall that the ones at each line i of the joint recurrence matrix J^ J , 
i,j = l,...,n', correspond to the matched time indices of the neighboring points to Xj 
in system X and the neighboring points to in system Y, and the number of matches 
determines the strength of coupling. Thus by random shuffling the lines of matrix Rfj, 
i, j = 1, . . . , n', we destroy this match, as for each y^, the neighbors for the X system do not 
regard any more Xj but another randomly chosen point. Repeating this random shuffling M 
times, we get M new matrices Rsf^j, I = 1, . . . , M, and M new joint recurrence matrices 
J Si - j . This allows us to take the average number of common neighbors at each time index 
i over the M realizations that regards the scenario of no coupling. The 'surrogate' MCR is 
then defined as 

MCRsx->r = - E M^i=i^3=i ^\^,j _ 

and CMCR is 

CMCRx^y = MCRx^y - MCRsx^y- (7) 



B. Corrected TE and STE 



For the estimation of the corrected TE (CTE), the same idea is implemented and we 
assume again M random shufflings of the points of the X system. Thus in the estimation 
of TE in Eq.(j4]), the terms of the correlation sums C{yt+h, ^t, yj and C(xt, yj are replaced 
by the corresponding mean values of the correlation sums estimated on the point shuffled 
surrogates, given as 

M 

Cs{yt+h,^t,yt) = -r7yC{yt+h,:)^tnyt) 
1=1 
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and 

M 

C,(xt,yJ = v7lIC'(xipyt), 

1=1 

where ti denotes a random time index and x^^ is the point in system X at the l-th rephcation 
for the time index t. Then, the 'surrogate' TE value is estimated as 

rpp,^ . Cs{yt+h,^t,yt)C{yt) 

ibSjf^y = log — -— - 

Cs{^t,yt)C{yt+h,yt) 

and CTE is defined as 

CTEx^Y = TEx^y — TEsx^y- (8) 
Note that instead of taking the average of M 'surrogate' TE values as in Eq.([n]) for ETE, a 
single 'surrogate' TE value is extracted by taking the average at each term of the expression 
of TE. The difference is actually in taking the average after or before the logarithm of each 
correlation sum. The former gives more variable estimates of TE on the surrogates as for 
small values of the correlation sums we obtain large negative logarithms. Thus by taking the 
mean of the correlation sums over all surrogates we stabilize the correlation sum to the most 
representative value expected if the systems were to be uncoupled. In turn this gives more 
stable estimation of the mean entropy terms and subsequently the mean transfer entropy 
for the surrogates. 

Substituting in Eq.® the expression for the original and surrogate TE, the terms H{y^) 

and H{yt+h,yt) cancel out and we get 

r^rj^j^ , C{yt+h,^t,yt)Cs{^t,yt) 
CTEx^y = log — — -. 

Cs{yt+h,^t,yt)C{xt,yt) 
This measure should be zero when X does not have any effect on Y. However, other sources 

of bias may still cause deviations from zero even in the lack of causal effect and this will be 

tested below through simulations. 

Corrected STE (CSTE) is defined analogously to CTE, and the expression of CSTE in 

terms of entropies is 

CSTEx^y = H{kt,y^) - H{yt+h,±t,yt) - ^s(xi, yj + x*, yj, (9) 

where 



and 



^^s(x„y,) = -E^(x*,,yJ (10) 
1=1 

^ M 

Hs{yt+h,^t,yt) = -TrY^Hivt+h.^ti^yt)- (n) 

1=1 
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IV. EVALUATION OF CAUSALITY MEASURES ON SIMULATED SYSTEMS 



A. Simulation Setup 

Measures of directional coupling are computed on 100 realizations of the following unidi- 
rectionally coupled systems, for increasing coupling strengths and for both directions X —^Y 
and Y ^X. 

• Two unidirectionally coupled Henon maps 
Xt+i = 1.4 — x'f + 0.3xt-i 

yt+i = 1.4 - cxtVt - (1 - c)y'f + 0.3yt-i 

with coupling strengths c = 0,0.05,0.1,0.2,0.3,0.4,0.5,0.6,0.7, whereas the onset to 
identical synchronization occurs at approximately c = 0.65 [l^ Il6| . 

• Two unidirectionally coupled Mackey-Glass systems 

dx 0.2xt_A, 



, — H in —O.lxt 
dt 1 + a;io^^ 

dy 0.2yt.^ , 0.20;^ 



+ c ;7'^\o^^ -O.lyt (12) 



dt 1 + yl\ 1 + 



for A J. and A^, taking the values 17, 30, 100 (all 9 combinations for the driving and 
response system) and with coupling strengths c = 0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4 and 
0.5. The choice of the different A^^^ and A^^ aims at investigating the performance of 
the measures on systems with the same and different complexity. For the three delay 
parameters the Mackey-Glass system is chaotic with correlation dimension about 2 
for Ax = 17, 3 for A^, = 30 and 7 for A^, = 100 3J]. The integration was done with 



the function dde23 of the MATLAB software and the time series were obtained with 
sampling time 4 sec. 



A coupled nonlinear stochastic system (see 35| ) 

xt = 3.4xt_i(l - zti)e-^?-i + OAeu 

yt = 3Ayt^i{l - y^_^)e-y''^ + 0.5xt_iyt_i + 0.4e2t 

zt = 3.42t_i(l - zl,)e-''-^ + 0.32/t_i + 0.5x?_, + 0.463* 

where e^, and 63* are standard white Gaussian noise processes. We note that the 
correct directed causal effects are X ^ Y, Y ^ Z and X Z. 
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The time series lengths for the coupled Henon maps are n — 512, 1024, 2048 and for 
the Mackey-Glass n — 2048. Gaussian white noise is also added to these time series with 
standard deviation 5% and 20% of the standard deviation of the time series. Further, we 
investigate the dependence of the measures on the state space reconstruction of the two 
systems. For the coupled Henon system, the embedding dimensions vary as = 1, . . . , 5 
and niy — 1, . . . , 5. For the coupled Mackey-Glass systems, rrix and niy vary from 1 and up 
to 10, depending on the delays of the coupled systems. For symbolic information measures, 
the embedding dimensions cannot be set equal to one as there will be no different symbolic 
patterns. 

In order to obtain quantitative summary results for the performance of the measures, 
t-tests for means are conducted for the following three null hypotheses Hq: 

• HJ: The mean of the measure in the direction X — > y is zero. 

• Hq: The mean of the measure for the direction F — > X is zero. 

• Hq! The means of the measures in the two directions are equal. 

We assume that the distribution of the measure in both directions formed from its values in 
100 realizations is normal and for Hq that they have the same variance, which both seem to 
be statistically satisfied (as resulted from the Kolmogorov-Smirnov test for normality and 
the Fisher test for equal variances apphed to some of the reahzations) . 

Using as samples the measure values from 100 realizations for each case, the performance 
of each measure is quantified in terms of the rejection or not of each of the three Hq at the 
significance level a = 0.05, giving a score zero if Hq is not rejected and one if it is rejected. 
So the total score for all three Hq ranges from to 3. There are two settings of interest 
for the coupling of X and Y: no coupling that regards the significance of the measure, for 
which the best total score is 0, and the presence of unidirectional coupling that regards the 
discriminating power of the measure, for which the best total score is 2, meaning rejection 
of Hq and Hq but not Hq (the latter yielding the direction of no causal effect). Note that 
score 2 can also be obtained if in both directions a measure is significantly different from 
zero but at the same level. Therefore, we will explicitly name the setting for each Hq when 
there is ambiguity from the total score. 
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B. Results on the unidirectionally coupled Henon maps 

MCR and CMCR measures are significantly affected by tlie embedding dimension, time 
series lengtli and noise level. There are two important problems with MCR: first, it in- 
creases also in the opposite direction (with no causal effect) with the increase of the coupling 
strength, which may erroneously be interpreted as bidirectional coupling, and second, it is 
positively biased in the uncoupled case (c = 0), especially for short time series. By increas- 
ing the time series length and the embedding dimensions, MCR and CMCR both decrease. 
The corrected measure CMCR obtains always smaller values than MCR in both directions 
and it is closer to zero for c = 0, particularly for small time series and small embedding 
dimensions (see Fig{T]). Both MCR and CMCR maintain a larger increase in the correct 



(a) (b) (c) 




0.2 0.4 0.6 0.2 0.4 0.6 0.2 0.4 0.6 



coupling strength coupling strength coupling strength 

FIG. 1: (a) Mean estimated values of MCR and CMCR for botfi directions from 100 realizations 
of tfie noise-free unidirectionally coupled Henon map, wliere irix = 2, niy = 2 and n = 512. In (b) 
and (c) as in (a), but for rux = 2, my = 4 and nix = 4, ruy = 2, respectively. 

direction of interaction with the coupling strength when nix = T^y, amplify the difference 
in the two directions when nix < and decrease this difference or even tend to suggest 
more interaction in the opposite direction when nix > i^y Addition of small noise level 
(5%) does not substantially affect MCR and CMCR, but 20% noise level lowers MCR and 
CMCR toward the zero level for both directions. In this case, increase of the embedding 
dimension regains the correct signature of coupling, but adds positive bias at a level that is 
clearly seen for c = 0. 

TE and mainly CTE are found to be more effective in detecting the direction of the infor- 
mation flow than MCR and CMCR. The causal effect is correctly detected in all simulations 
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on the Henon maps. The correct direction is preserved for all combinations of the embed- 
ding dimensions except for rux = 1 and niy > 1, where a spurious increase at the direction 
Y ^ X is observed. The CTE measure is always effectively zero for both directions when 
c = 0, whereas TE tends to be positive (at cases deviating significantly from the zero level) 
and ETE gives negative values and not equal in both directions. Also, ETE is affected by 
the selection of the embedding dimensions much more than TE and CTE (see Fig 12]). 



(a) (b) (c) 




coupling strength coupling strength coupling strength 



FIG. 2: (a) Mean estimated values of TE, ETE and CTE for both directions from 100 realizations 
of the noise- free unidirectionally coupled Henon maps, with n = 512 and = my = 2. (b) and 
(c) as in (a) but for = 2, niy = 4 and nix = 4, my = 2, respectively. 

The three information measures turn out to be robust to noise and the detection of the 
direction of interaction gets blurred only at few combinations of embeddings dimensions for 
high noise levels and short time series. This is because the variance of the estimated mea- 
sures increases with the embedding dimension and the noise level. Thus for small coupling 
strengths, the distribution of the measures in the two directions may overlap and suggest 
no discrimination in the two directions. 

Symbolic transfer entropy (STE) and its corrections (CSTE and ESTE) seem to be more 
affected by the selection of the embedding dimensions than the respective TE measures. In 
the presence of unidirectional coupling, STE and CSTE detect it correctly for > niy > 2, 
while ESTE is significantly affected by the embedding dimensions (see FigJHH and b). CSTE 
is the least sensitive to noise and gives the most consistent results in the case of no causal 
effects, whereas STE is positively biased and ESTE negatively biased (see FigJSb). The 
variance of the symbolic measures is small and does not seem to increase with the addition 
of noise as much as for the TE measures, so that the overlap in the two directions for small 
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(a) (b) (c) 




coupling strength coupling strength coupling strength 



FIG. 3: (a) Mean estimated values of STE, ESTE and CSTE for both directions from 100 realiza- 
tions of the noise-free unidirectionally coupled Henon maps, with n = 512 and nix = 'niy = 3. (b) 
As in (a) but for nix = 4, niy = 3. (c) As in (b) but for 20% noise level. 

coupling strength is much smaller. However, for small coupling strengths, estimated values 
of the symbolic measures from the two directions again overlap. 

The graphical results, as those shown in Figures [l]l3l are quantified by the score of the 
three statistical tests. When there is no coupling, the measures MCR and CMCR reject 
almost always Hi and and reject H^ when nix = "'^y, so they always score at least 2 (see 
Table H]). The reason for the rejections is that the measures are positively biased and have 
very small standard deviation, and apparently the proposed correction of MCR can neither 
eliminate this bias. Though the bias decreases with the increase of the time series length, the 
score for MCR and CMCR is still at least 2 and addition of noise does not change the score 
results. On the other hand, CTE scores for all combinations of embedding dimensions, 
even for as small time series lengths as n = 512, whereas TE does this only for large nix, 'my 
and ETE always scores at least 2. CSTE also often scores for nix, > 2, while STE and 
ESTE perform poorly, rejecting Hq and H^ mostly due to the positive and negative bias, 
respectively. The proper performance of CTE and CSTE is not substantially affected by the 
addition of noise. 

For the second setting of the presence of unidirectional coupling, the measures perform 
rather similarly. CMCR and MCR give scores 2 or 3, meaning that besides H^ also H^ is 
rejected, erroneously due to positive bias, and at cases Hq is rejected as well. Hq is often 
rejected by the information measures due to either positive bias (TE) or negative bias (ETE 
and sometimes CTE). ETE gives systematically negative values, while CTE might have a 
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TABLE I: Scores for the setting of no coupling from the 100 reahzations of the uncoupled Henon 
maps with n = 512, for noise-free data and 20% noise (the latter values are in parentheses). 





scores, 0% noise (20% noise) 


rux 


y 


MCR 


CMCR 


TE 


ETE 


CTE 


STE 


ESTE 


CSTE 


2 


2 


2(2) 


2(2) 


2(2) 


2(0) 


0(0) 


2(2) 


1(0) 


1(0) 


2 


3 


3(3) 


3(3) 


3(1) 


3(0) 


0(0) 


2(3) 


0(0) 


1(0) 


2 


4 


3(3) 


3(3) 


3(2) 


3(2) 


0(0) 


3(2) 


2(3) 


1(1) 


2 


5 


3(3) 


3(3) 


3(2) 


3(2) 


0(0) 


2(3) 


2(2) 


0(0) 


3 


2 


3(3) 


3(3) 


3(0) 


3(0) 


0(1) 


2(2) 


2(0) 


3(0) 


3 


3 


2(2) 


2(2) 


2(0) 


2(0) 


0(0) 


2(2) 


2(2) 


0(0) 


3 


4 


3(3) 


3(3) 


2(0) 


3(3) 


0(0) 


2(3) 


2(3) 


0(0) 

v / 


3 


5 


3(3) 


3(3) 


3(0) 


3(3) 


0(0) 


3(3) 


3(3) 


0(0) 


4 


2 


3(3) 


3(3) 


3(0) 


3(2) 


0(1) 


3(2) 


3(2) 


0(1) 


4 


3 


3(3) 


3(3) 


2(2) 


3(3) 


0(1) 


3(3) 


2(3) 


2(0) 


4 


4 


2(2) 


2(2) 


0(0) 


2(2) 


0(0) 


2(2) 


2(2) 


0(0) 


4 


5 


3(3) 


3(3) 


0(2) 


3(2) 


0(2) 


3(2) 


3(3) 


0(0) 


5 


2 


3(3) 


3(3) 


3(2) 


3(3) 


0(1) 


2(3) 


2(2) 


1(0) 


5 


3 


3(3) 


3(3) 


3(0) 


3(3) 


0(0) 


3(3) 


3(3) 


2(0) 


5 


4 


3(3) 


3(3) 


0(0) 


3(2) 


0(2) 


3(2) 


3(3) 


0(0) 


5 


5 


3(2) 


3(2) 


0(2) 


2(0) 


0(2) 


2(2) 


2(3) 


0(2) 



slightly negative mean at some cases, however its estimated values are around zero. CSTE 
turns out to outperform all the other measures and gives a proper score 2 (only Hq is not 
rejected) and it is also robust against noise, even for high level of noise (20%). 

C. Results on the unidirectionally coupled Mackey- Glass systems 

For the unidirectionally coupled Mackey-Glass systems, the MCR measures increase also 
in the opposite direction with the coupling strength for all embedding dimensions. The 
MCR and CMCR values are larger in the correct direction only for rrix < ruy, whereas 
for nix = fT^y they increase close together in both directions (see Figj4]). Addition of noise 
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worsens the performance of the MCR measures. It is noteable that for A^. = A^, MCR and 
CMCR point to the wrong direction of interaction. 




0.1 0.2 0.3 0.4 0.5 ° 0.1 0.2 0.3 0.4 0.5 
coupling strength coupling strength 



FIG. 4: (a) Mean estimated values of MCR and CMCR for both directions from 100 realizations 
of the noise-free unidirectionally coupled Mackey-Glass systems with = 30 and = 100, 
n = 2048 and = my = 3. (b) As in (a) but for rux = 4, ruy = 3. 

The transfer entropy measures are also significantly affected by the embedding dimension. 
The direction of the causal effect is generally detected with all measures when nix > rriy, 
but fails when m^. is much smaller than rUy. For example, for A^. = 30 and Ay = 100, all 
measures detect the correct driving effect when m^, > shown in FigJS^ for m^, = 4, 

rriy = 3, contrary to the MCR measures shown for the same setup in FiglDD. For systems 
with Aa; = Aj^, the detection is problematic even for rUx = rriy, and the stronger the 
coupling the larger rrix = niy is needed to be detected. This feature is shown in FiglSt and 
d for A^. = Ay = 17, where TEx->y > TEy_^x holds only for very weak coupling when 
rrix = my = 3 (see FigJSt), and in order to achieve TEx->y > TEy_^x also for stronger 
coupling rrix = rriy has to be increased to 5 (see FigJSli). In all cases, TE, CTE and ETE 
show the same signature (almost parallel lines in FigjS]), but CTE attains best the zero level 
at c = 0, whereas TE is slightly positively biased for certain embedding dimensions and 
ETE is negatively biased. Addition of noise does not change these structures but decreases 
their mean value and increase their variance, particularly for large embedding dimensions 
(see FiglHb). 

The symbolic transfer entropy measures depend on the embedding dimensions more than 
the transfer entropy measures, and fail more often to detect the correct causal effect, as can 
be seen from the comparison of the results on TE measures in Figj5^, b and c and STE 
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FIG. 5: (a) Mean estimated values of TE, ETE and CTE for both directions from 100 realizations of 
the noise-free unidirectionally coupled Mackey-Glass systems (A^; = 30, Ay = 100) with n = 2048 



and Tfix = 4 and ruy = 3. (b) As in (a) but for 



noise level, (c) and (d) As in (a) but for 



A., = A,, 



17, nix = fny = 3 and rux = my = 5, respectively. 



measures in FigJSlfor the same simulation setup. However, CSTE gives values around zero 
for any selection of embedding dimensions in case of no causal effects. 
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FIG. 6: The panels are as for the three first panels of Figl5]but for the symbolic transfer entropy 



measures. 
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Regarding the formal hypothesis tests, for the uncoupled Mackey-Glass systems, MCR 
and CMCR scored high in the first setting of no coupling and rejected almost always Hq 
and Hq for the Mackey-Glass system (see Table HTl for A^. = 17, Ay = 30 and nix = f^y)- 
CTE and CSTE scored overall worse than for the Henon map, but still better than their 

TABLE II: Scores for the setting of no coupling of the information measures, from the 100 reaUza- 
tions of the uncoupled Mackey-Glass system (A^,. = 17, Ay = 30) with n = 2048, for the noise-free 
case and for 20% noise level. 
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respective counterpart, TE, ETE and STE, ESTE, respectively. For example, for A^. = 30 
and Ay = 100, CSTE scored in most of the combinations of m^, and my, while the other 
information measures performed poorly scoring mostly 3 or 2. 

For all scenarios of complexity of the coupled Mackey-Glass systems, STE obtains sig- 
nificant positive values also for c = when rrix < ruy, while ESTE obtains often negative 
values. CSTE foUows the same dependence on c as STE but displaced so that CSTE falls 
at the zero level for c = 0. It is interesting that in the presence of noise, STE gets even 
larger values for c = and increases faster for c > 0, and CSTE has the same course with 
c as STE but starts at the zero level for c = 0. We illustrate this nice property of CTE 
and CSTE for c = as a function of m^., where nix = f^iy in Fig 171 First, we note that 
STE measures have much less variance than the respective TE measures and attain the zero 
level for small nix = f^iy (in FiglT^i only the distribution of CTE contains zero for varying 
TUx = 'my). In FigITt, where the systems have different complexity, ETE gets more affected 
by the individual system complexity as the embedding dimension increases, TE and CTE 
do not differ much in the two directions, but only CTE is at the zero level. 
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FIG. 7: (a) Mean TE, ETE and CTE (error bars denote the standard deviation) from 100 real- 
izations of the noise-free uncoupled Mackey-Glass systems with Ai = A2 = 17, n = 2048, and for 
varying rrix = my. Only the one direction is shown as the two systems are the same, (b) As in (a) 
but for STE, ESTE and CTE. (c) As in (a) but for Ai = 30, A2 = 100 and the two directions are 
shown, as given in the legend, with a measure at each panel. The drawn points and error bars for 
each measure are slightly displaced along the x-axis to facilitate visualization. 



In the presence of unidirectional coupling, MCR and CMCR score at least 2 as the 
first and second Hq are rejected, meaning that the measures are positively biased and the 
estimated values of the measures increase at the same level in both directions. The latter 
does not occur with the information measures and Hq is almost always rejected, as well 
as Hq (the correct direction of coupling). In this task, the corrected measures (CTE and 
CSTE) performed similarly to the original measures. 

The simulations on the coupled Mackey-Glass systems showed that CTE and CSTE 
improve the performance of the original measures, giving values closer to zero when the 
systems are uncoupled. Similarly to the coupled Henon maps, the optimal selection for 
embedding dimensions is rrix = my. For rrix > my, the dynamics of the driving system 
are over-represented giving larger TE and STE values in the correct direction, whereas for 
m^ < my the opposite effect is observed decreasing TE and STE for X — )■ "K and increasing 
TE and STE for Y ^ X, so that for very small m^ the measure values are even larger for the 
wrong direction Y ^ X. Though CTE and CSTE decrease the positive bias due to uneven 
representation of the systems when m^ 7^ my they cannot remove it completely. Another 
bias that cannot be vanished by the correction of the transfer entropy measures is due to the 
individual dynamics, which persist for m^ = my. The bias turns out to be larger when the 
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two systems have identical individual dynamics, i.e. A^. = Aj^. We found that for all three 
Ax = Ay values we tested for, only small coupling strength c could be detected correctly 
using small = niy and for larger c the embedding dimension should be increased with 
the complication that it may be too large for the given time series length. We attribute 
this to the similarity of the trajectories of the driving and response system, even when they 
are not in phase, so that a larger time window length from each trajectory is required to 
detect differences (in terms of entropies) that reveal the driving effect. When A^ ^ Ay, as 
driving increases the shape of the trajectories of the response system gets closer to that of 
the driving system and therefore the driving effect can be detected better even for small 
rrirc = my. Indeed this was the case for all combinations of A = 100 and any of A = 17, 30 
(at any order of driving and response). For the pair (A^ = 17, A^, = 30) the difference in the 
individual dynamics was smaller and the detection of the correct driving effect for c > 0.2 
required that = niy be as large as 6, while for the pair (A^. = 30, Ay = 17) even larger 
nix = was required. 

D. Results on the coupled nonlinear stochastic system 

For the coupled nonlinear stochastic system of three variables the driving effects X — )■ F, 
Y ^ Z and X ^ Z take place at lag one, so we expect that rrix = niy = 1 be sufficient 
for all pairs of variables. However, MCR gives larger values at the correct driving effect 
X — )■ y (meaning any of X — )■ F, F — > Z and X — )■ Z) for larger embedding dimensions 
i^x > i^y and with a large bias that decreases with the increase of time series length. CMCR 
reduces the MCR values but does not attain the zero level when evaluated for no coupling 
or opposite driving effect. 

TE is also positively biased for all embedding dimensions and one can only observe the 
correct driving from the relative difference in the two directions. Though TE decreases 
with the increase of time series length, it stays positive also for the opposite driving effect 
(for the pair (X, Y) of the stochastic systems see FiglHK for = niy = 1 and Fig|8t for 
nix = i^y = 2). CTE resolves this problem reducing the bias in both directions so that CTE 
for the opposite driving effect is at the zero level (FiglHb for mx = niy = 1 and FigJSJl for 
nix = niy = 2). Note that for nix = = 1 CTE reduces to ETE, whereas for nix = niy = 2 
ETE is different and goes negative obtaining smaller relative difference between the two 
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FIG. 8: (a) Mean TE (error bars denote the standard deviation) from 100 realizations of the coupled 



stochastic system vs time series length log2 for the variables X and y, and 



1. (b) 



As in (a) but for CTE. (c) As in (a) but for m^. 



2. (d) As in (b) but for CTE and ETE 



and 



m. 



y 



2. The drawn points and error bars for each measure are slightly displaced along 



the X-axis to facilitate visualization. 

directions as for the other systems. The same results are observed for the other two variable 
pairs. 



The situation with STE and its correction is similar to TE, but starting at 



2, 



with the only difference that for the pair (X, Z) the correct driving X ^ Z is less evident 
as for this case it is nonlinear and weaker. 



E. Comparison to other types of surrogates 

We compare the corrected measures defined in terms of random shuffling of the recon- 
structed points to other surrogates data schemes, i.e. twin surrogates jsol and time-shifted 



surrogates 



13|. We concentrate on the TE measure, but our simulations with STE pro- 



duced similar results. Measures using the twin or time-shifted surrogates are estimated as 
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the difference of TE on the original bivariate time series and the mean TE on M surrogates. 
CTE turns out to perform the same as for the two types of surrogates or even better at 
cases. Although all three measures (using shuffled reconstructed vectors, twin surrogates 
and time-shifted surrogates) establish the zero level for the direction Y ^ X, CTE is larger 
for X — > y for the whole range of c > 0. Representative examples are given in FigJHlfor the 
time-shifted surrogates and the Mackey-Glass system. We note here that twin surrogates 
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FIG. 9: (a) Mean CTE and TE from the time-shifted surrogates, denoted as tsTE (error bars 
denote the standard deviation) from 100 realizations of the noise-free Mackey-Glass systems with 
Ai = A2 = 17, n = 2048, and for nix = = 5. (b) and (c) are as (a) but for Ai = 17, A2 = 30 
and for Ai = 30, A2 = 100, respectively. The curves and error bars for each measure are slightly 
displaced along the x-axis to facilitate visualization. 

have the highest computational cost because of the long computation time in constructing 
the surrogates. 

V. APPLICATION TO EEC 

The measures considered in the simulation study are evaluated on two scalp preictal EEG 
records of 25 channels (system 10-20 with added low rows) and one intracranial EEG preictal 
record of 28 channels in a grid. We want to evaluate how the measures detect changes in 
the interactions of any pair of channels from the early to the late preictal state. The first 
extracranial EEG record is for a generalized tonic clonic seizure and the other for a left 
back temporal lobe epilepsy. No specific artifact removal method was applied but to attain 
better source derivation at small cortical regions, for each EEG channel, the mean EEG of 
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the four neighboring channels was subtracted 36| . The pairs of transformed EEG that were 



used for the estimation of the measures are: central left (C3) vs right (C4), temporal left 
(T7) vs right (T8), frontal left (F3) vs right (F4) and parietal left (P3) vs right (P4). For 
the intracranial EEG, the pairs of channels were either from the same brain area (two left 
frontal (LTP-1 vs LTP-3), two left temporal (LST-1 vs LST-3), two left occipital channels 
(ROT-1 vs ROT-3) and the same for the right side (RTP-1 vs RTP-3, RST-1 vs RST-3, 
ROT-1 vs ROT-3), or from opposite brain areas (a left and right frontal (LTP-1 vs RTP-1), 
temporal (LST-1 vs RST-1) and occipital channels (LOT-1 vs ROT-1). 

The data windows are from 4 hours to 3 hours before the seizure onset (early preictal 
state) and the last one hour before the seizure onset (late preictal state). Each one hour long 
data window is split to 120 successive non-overlapping segments of 30 sec and the causality 
measures are estimated for the channel pairs at each segment and for both directions. As 
the sampling frequency is 100 Hz, each 30 sec segment consists of 3000 data points. For the 
estimation of the measures, the embedding dimensions are nix = = 3 and rux = my = 5, 
the time horizon is h = 5, the lags are Tx = Ty = 5 and the radius is r = 0.2 times the 
standard deviation of the data. 

The causality measures indicate a bidirectional form of information flow among most of 
the brain areas, and this is present at both preictal states. Only few causality measures detect 
a slight change in the information flow between the early and late preictal state. The increase 
of mx,my from 3 to 5 decreases the measure values, as expected also from the simulation 
study (e.g. see FiglTOk and b for the MCR measures). Corrected measures, as expected, give 
lower values than the original measures. For the same example, CMCR shown in FigJTOb. 
drops to the zero level for most of the segments regardless of the state, which shows that 
the interaction observed by MCR may as well be attributed to bias in the estimation that 
originates from the individual system dynamics and state space reconstruction. A similar 
drop is observed for the corrected information measures, as shown in FigJTOld and e for TE 
and CTE, respectively. 

There is no consistent result from all measures for the direction of the interdependence. 
For example, for the pair of channels (C3, C4) and for the flrst seizure, MCR measures 
suggest that C3 drives C4, but not CMCR as shown in FiglTOl whereas the information 
measures indicate a bidirectional coupling. STE in particular, for rrix = my = 5, manifests 
an abrupt drop just shortly before the seizure onset for all pairs of channels (see FigJTTb 
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FIG. 10: (a) MCR profiles from both states (early and late preictal) of the first seizure and both 
directions, for channels as in the legend, with rrix = my = 3. (b) As in (a) but for m^. = my = 5. 
(c), (d) and (e) are as in (a) but for CMCR, TE and CTE, respectively. The preictal periods are 
indicated by the time in min, with reference to time at seizure onset and they are separated by 
a vertical dashed line. 

for channels C3, C4). CSTE renders this drop, giving values around zero for all times (see 

FigEb). 

For the second seizure of temporal lobe type, significant change in the interdependence 
between the two preictal states could not be observed, at least for the selected pairs of 
channels. Bidirectional coupling is suggested by the original measures at both states, whereas 
the corrected measures again give values around zero. TE and CTE were rather unstable, 
exhibiting large fluctuations across the successive segments of each preictal state. Moreover, 
they had computational problems and they could not always be calculated when = niy > 
5 (correlation sum contained zero terms due to lack of close neighboring points). 

Although intracranial EEG are less noisy, no clear indication of change in the causal 
effects between early and late preictal states could be observed as well. Corrected measures 
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FIG. 11: (a) STE profiles from both states (early and late preictal) of the first seizure and both 



directions, for channels as in the legend, with 



m,, 



5. (b) As in (a) but for CSTE. 



again gave values lower than the original measures (see Fig. [T2|) , and question the coupling 
detected by the original measures. 
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FIG. 12: (a) MCR profiles from both states (early and late preictal) and both directions of the 
intracranial data, for channels as in the legend, with nix = f^y = 5. (b), (c) and (d) are as in (a) 
but for CMCR, STE and CSTE, respectively. 
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VI. DISCUSSION 



The estimation of the strength and direction of interaction in coupled systems from 
hmited and possibly noisy bivariate time series was shown to encounter a number of problems 
regardless of the employed measure. It was shown that the coupling measures are affected 
by a number of factors, including individual system complexity, state space reconstruction, 
time series length and noise. These factors add bias to the estimation of the strength of 
coupling that may not be the same in both directions. 

We concentrated on reducing the bias in each component of the causality measures of 
mean conditional recurrence (MCR), transfer entropy (TE) and symbolic transfer entropy 
(STE). Since there are diverse sources of bias, we attempted to account for all of them by 
assuming the value of each component measure when the systems are not coupled. For 
this, we developed the idea of surrogate data and we randomly shuffled the points of the 
reconstructed state space trajectory of the driving system. Considering the decomposition 
of each measure to component quantities, for each component the average on an ensemble 
of realizations of the surrogate driving trajectory was computed and subtracted from the 
respective original component value. Replacing the corrected components in the expression 
of the coupling measure, some of the bias is removed. The proposed corrected measure 
indeed performed as expected in simulations, but the amount of reduction of the bias varied 
with the measure: for MCR the reduction with the corrected MCR (CMCR) was small in 
most simulations, whereas it was much larger in the application on epileptic EEC; for TE 
and STE the reduction was larger and in most cases effective, so that the corrected measures, 
CTE and CSTE respectively, were at the zero level in the absence of coupling. 

One could argue that it is intuitively more appropriate to compute first the coupling 
measure on the surrogate realizations and then take the average, instead of taking the 
average of the components in the measure expression. For example, in the computation of 
CTE, we take the average of the correlation sums in EqJH over all surrogates, while one 
would expect to take the average of the whole expression for TE, which would be equivalent 
to taking averages of the logarithms of the correlation sums. The latter gives more variable 
estimates of TE on the surrogates as for small values of the correlation sums we obtain large 
negative logarithms. Indeed our simulations showed that this version of CTE produces more 
varying results encountering also large negative values for some realizations. 



26 



The main advantage with the corrected measures is that they estabhsh significance, mean- 
ing that they do not indicate significant couphng when it is not there. This has been shown 
with all tested measures, CMCR, CTE and CSTE, and for varying conditions of system 
complexity (Henon maps and Mackcy-Glass of varying complexity), state space reconstruc- 
tion (a range of embedding dimensions), time series length and noise. For TE and STE, 
we considered also the so-called "effective" measures, denoted ETE and ESTE, respectively, 
which use a similar surrogate approach but the random shuffling is done on the samples of 
the time series. The simulation results showed that this approach gives varying estimation of 
strength and direction of coupling that often does not correspond to the real couphng. The 
use of twin surrogates or time-shifted surrogates gives the same or worse results compared 
to the suggested corrected measures. 

The performance of the measures was also assessed by statistical testing, where the 
samples for the test were the measure values on a number of realizations. CTE and CSTE 
were consistently found to be statistically insignificant in both directions in the absence of 
couphng, as opposed to the original TE and STE, as well as ETE and ESTE. In the presence 
of causal effect, CTE and CSTE could identify it with the same statistical significance as 
TE and STE, respectively. The correction of MCR also improved the statistical results, but 
not as clearly as for the information measures. Comparing CTE and CSTE, the simulations 
showed that CSTE was more dependent on the selection of the embedding dimensions but 
more robust against noise. 

TE, and subsequently CTE, have computational problems when the embedding di- 
mension is large, at least when correlation sums are used for their estimation, because 
stable statistics on neighborhoods within a given distance cannot be established when the 
state space dimension is large. This was found to be the case for the application to EEC 
when the embedding dimension was larger than 5, where TE and CTE fiuctuated a lot on 
successive segments of pairs of EEC channels. On the other hand, STE and CSTE were 
stable and in many cases CSTE provided values close to zero, whereas STE was always 
larger. CMCR also gave significantly reduced values compared to MCR, but not at the 
zero level. Interpreting these results in view of the simulation results, CSTE was the 
most conservative in giving evidence for coupling, but most reliable as well, so that when 
coupling was actually indicated by CSTE it would be likely to be true coupling. We could 
not find any clear evidence that there exists a particular spatial structure of coupling at 
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the different cortical regions we tested, or that there is a change of the couphng structure 
from early preictal to late preictal state, at least on the three EEG records we studied. 
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